Combination of Fibrosis-4, liver-stiffness measurement, and Fibroscan-AST score to predict liver-related outcomes in nonalcoholic fatty liver disease

Introduction: Noninvasive tests, such as Fibrosis-4 (FIB-4), liver-stiffness measurement (LSM) by vibration-controlled transient elastography (VCTE), and Fibroscan-AST (FAST), are frequently used for risk stratification in NAFLD. The comparative performance of FIB-4, LSM, and FAST to predict clinical outcomes of patients with NAFLD remains unclear. We aim to evaluate the performance of FIB-4, LSM, and FAST scores to predict clinical outcomes in patients with NAFLD. Methods: We included consecutive adult patients with NAFLD who underwent transient elastography between 2015 and 2022 in the United States and Singapore. Patients were stratified based on baseline FIB-4, LSM, and FAST score and followed up for clinical outcomes, namely liver-related events (LREs), LREs or death, death, and major adverse cardiac events. Results: A total of 1262 patients with NAFLD (63% with obesity and 37% with diabetes) who underwent vibration-controlled transient elastography were followed up for a median of 3.5 years. FIB-4 stratified patients with NAFLD into low-risk (<1.3), intermediate-risk (1.3–2.67), and high-risk (>2.67) groups in 59.4%, 31.5%, and 9.1% of cases, respectively. No LRE occurred with baseline FIB-4 <1.3, regardless of LSM and FAST score. Higher FIB-4 was associated with a higher risk of LREs within each LSM category. FIB-4 had a higher area under the receiver operating characteristic curve than LSM or FAST score to predict LRE. Conclusions: In this multicenter international study, FIB-4 and LSM synergistically predicted the risk of LRE. In patients with FIB-4 <1.3, vibration-controlled transient elastography may incorrectly classify up to 10% of the patients as high risk. FIB-4 should be incorporated into risk stratification in NAFLD even among patients who underwent VCTE.


INTRODUCTION
NAFLD affects nearly one-third of the global population [1] and is a leading indication for liver transplantation. [2] However, most patients with NAFLD do not develop decompensated liver disease, highlighting the importance of cost-effective risk stratification strategies that identify high-risk patients with NAFLD without overwhelming tertiary care centers with low-risk patients. [3,4]
Current guidelines recommend the sequential use of the Fibrosis-4 (FIB-4) score followed by liver-stiffness measurement (LSM) by vibration-controlled transient elastography (VCTE) to risk-stratify patients with NAFLD. [5] While sequential testing using noninvasive tests (NITs) has been shown to improve the classification of patients with NAFLD into fibrosis stages, [6] NITs frequently yield discordant results, and it remains unclear how to interpret such discrepancies. It also remains unclear whether the combination of LSM and FIB-4, or LSM alone, improves on the performance of FIB-4 alone to predict clinical outcomes in patients with NAFLD. While LSM-based strategies correlate with clinical outcomes among patients with viral-associated and alcohol-associated cirrhosis, [7] such data remain limited among patients with NAFLD. [8] There are also concerns about the lower accuracy of LSM among patients with NAFLD who have obesity or are at low risk. [9,10]
In addition to risk stratification based on the fibrosis stage, accurately identifying patients with high-risk NASH is important to identify potential clinical trial participants. Traditionally, the diagnosis of NASH requires a liver biopsy. [3] More recently, combinations of NITs have been proposed to identify at-risk patients with NASH beyond stage 2 fibrosis, such as the Fibroscan-AST (FAST) score, [11] even though its external validation remains limited. The prognostic value of the FAST score is of great interest because it provides a noninvasive alternative to liver biopsy to evaluate both the inflammatory and fibrosis burden in patients with NAFLD. However, it is not known whether the FAST score predicts clinical events in patients with NAFLD, especially when used in conjunction with FIB-4 or LSM.
Therefore, in this study, we aim to determine the prognostic significance of the FIB-4, LSM, and FAST scores to predict liver-related events (LRE: defined as hepatic decompensation, HCC, or liver transplantation) among patients with NAFLD from Asian and Western centers. We also aim to perform sensitivity analyses to determine the prognostic significance of these NITs in predicting death, LRE/death, and major adverse cardiac events (MACEs).

Study population
This is a multicenter, retrospective cohort study of consecutive adults (age above 18 years) with NAFLD from the University of Michigan Health System (United States of America) and Changi General Hospital (Singapore) who underwent VCTE between January 1, 2015, and December 31, 2022. The study was approved by the respective institutional ethics committees with waiver of consent granted, and was conducted in compliance with the Declarations of Helsinki and Istanbul.
NAFLD was diagnosed based on either radiological (ultrasound, CT, or MRI) or histological evidence of hepatic steatosis, without documented alternative chronic liver disease or significant alcohol intake (defined as >1 U/d in females or >2 U/d in males). Clinical data were collected using a unified data template.

Outcome measures
Our primary outcome was the occurrence of the first LRE, defined as liver decompensation (variceal bleeding, clinically overt ascites, or overt HE), HCC, or liver transplantation. [13]
Variceal bleeding was confirmed from endoscopy and consultation reports. Ascites was defined as clinically overt ascites requiring diuretic treatment, large-volume paracentesis, or transjugular intrahepatic shunt placement. Overt HE was defined as West Haven Classification grade 2 or higher, as assessed by the managing specialists. We included three secondary outcomes. First, given that most patients with NAFLD do not die of liver disease, [14] we included a composite end point of either LRE or all-cause mortality. Second, we included an outcome of all-cause mortality. Third, MACEs were defined as a composite end point of myocardial infarction, coronary revascularization, heart failure requiring hospitalization, or stroke. [15] All clinical events were manually reviewed for verification. We excluded patients with less than 6 months of follow-up or with events that occurred within the first 6 months of the study to avoid misclassifying prevalent disease as incident.
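The 6-month exclusion rule described above can be illustrated with a minimal sketch. The `Patient` record, its field names, and the use of a strict "less than 6 months" comparison are assumptions made for illustration; the study does not describe how the rule was implemented in its data template.

```python
from dataclasses import dataclass
from typing import List, Optional

# 6 months expressed in years, per the exclusion rule in the text.
LANDMARK_YEARS = 0.5


@dataclass
class Patient:
    """Hypothetical record; field names are illustrative, not from the study."""
    patient_id: str
    followup_years: float                      # baseline VCTE to last contact
    event_time_years: Optional[float] = None   # time to first event, None if no event


def is_eligible(p: Patient) -> bool:
    """Keep patients with >= 6 months of follow-up and no event before 6 months.

    Assumption: events occurring exactly at 6 months are retained.
    """
    if p.followup_years < LANDMARK_YEARS:
        return False
    if p.event_time_years is not None and p.event_time_years < LANDMARK_YEARS:
        return False
    return True


def apply_landmark(cohort: List[Patient]) -> List[Patient]:
    """Return the analysis cohort after applying the 6-month exclusion."""
    return [p for p in cohort if is_eligible(p)]


# Toy cohort with made-up values, used only to exercise the rule.
toy_cohort = [
    Patient("A", followup_years=3.2),
    Patient("B", followup_years=0.4),                        # too little follow-up
    Patient("C", followup_years=2.0, event_time_years=0.3),  # early (prevalent) event
    Patient("D", followup_years=4.1, event_time_years=1.5),  # incident event
]
analysis_cohort = apply_landmark(toy_cohort)
```

In this toy cohort, patients B and C are dropped, which mirrors the stated intent of not counting prevalent disease as incident.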

Statistical analysis
We summarized the baseline characteristics of our cohort by study site. Continuous data were reported as mean ± SD or median with interquartile range, depending on the normality of the data distribution. Categorical data were summarized by frequency (percentage). Numerical and categorical baseline variables were compared using the 2-sample t test or Mann-Whitney U test and the chi-square or Fisher exact test, respectively.
The Kaplan-Meier method with the log-rank test was used to compare time-to-event variables across groups. We reported both the cumulative incidence and the incidence rate (reported as events per 1000 person-years) with the respective 95% CI for different subgroups (FIB-4, LSM, or FAST). The diagnostic statistics of NITs and combinations of NITs to predict clinical outcomes were reported. The time-dependent area under the receiver operating characteristic curve (tAUC) at 3 years was compared between NITs using the Delong test at optimal cutoffs selected by (1) the Youden Index, (2) sensitivity ≥ 90%, and (3) specificity ≥ 90%. [16,17]
Sensitivity analysis was performed to compare the tAUC of NITs to predict LRE or LRE/death at 5 years. We estimated the risk of developing LRE using Fine-Gray competing risk regression, with death as the competing risk, expressed as the subdistribution HR (with 95% CI). [18] To determine the performance of NITs in identifying low-risk patients with NAFLD, we compared the misclassification of low-risk NAFLD between NITs by performing the test of marginal homogeneity. Statistical analysis was performed using STATA/SE version 17.0 (StataCorp LLC, USA) and R version 4.1.2 (R Foundation for Statistical Computing).
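The three cutoff-selection rules named above (Youden Index, sensitivity ≥ 90%, specificity ≥ 90%) can be sketched as follows. This is a simplified illustration that treats the outcome as a plain binary label and ignores censoring, whereas the study used time-dependent ROC analysis in STATA and R. All function names and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[float], events: Sequence[int],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff is called positive.

    Assumes the data contain at least one event and one non-event.
    """
    tp = sum(1 for s, e in zip(scores, events) if e and s >= cutoff)
    fn = sum(1 for s, e in zip(scores, events) if e and s < cutoff)
    tn = sum(1 for s, e in zip(scores, events) if not e and s < cutoff)
    fp = sum(1 for s, e in zip(scores, events) if not e and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores: Sequence[float], events: Sequence[int]) -> float:
    """Cutoff maximizing J = sensitivity + specificity - 1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sens_spec(scores, events, c)) - 1)


def high_sensitivity_cutoff(scores: Sequence[float], events: Sequence[int],
                            target: float = 0.90) -> float:
    """Largest observed cutoff that still keeps sensitivity >= target."""
    ok = [c for c in set(scores) if sens_spec(scores, events, c)[0] >= target]
    return max(ok)


def high_specificity_cutoff(scores: Sequence[float], events: Sequence[int],
                            target: float = 0.90) -> float:
    """Smallest observed cutoff that reaches specificity >= target."""
    ok = [c for c in set(scores) if sens_spec(scores, events, c)[1] >= target]
    return min(ok)


# Made-up scores and outcomes (1 = event), for illustration only.
toy_scores = [0.8, 1.0, 1.2, 1.5, 2.0, 2.8, 3.1, 3.5]
toy_events = [0, 0, 0, 0, 1, 0, 1, 1]

j_cut = youden_cutoff(toy_scores, toy_events)
rule_out_cut = high_sensitivity_cutoff(toy_scores, toy_events)
rule_in_cut = high_specificity_cutoff(toy_scores, toy_events)
```

A sensitivity-targeted cutoff suits ruling out low-risk patients, while a specificity-targeted cutoff suits ruling in high-risk patients; the Youden Index weighs both equally.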

Baseline demographics
A total of 1262 patients with NAFLD were included from the United States and Singapore (Supplemental Figure S1, http://links.lww.com/HC9/A486). The cohort was predominantly White (62%) with a mean age of 52 years (Table 1). The mean (± SD) body mass index was 31.9 (±6.7) kg/m², with 63.1% of the population having obesity and one-third having diabetes mellitus. The median (interquartile range) follow-up was 3.5 (2.4-4.6) years, with 4342 person-years of follow-up in total. Although a lower proportion of the Singapore cohort than the US cohort had obesity (46.0% vs. 67.9%, p < 0.0001), the Singapore NAFLD cohort was older and had more metabolic comorbidities such as diabetes mellitus, hypertension, and hyperlipidemia (p < 0.0001 for all) (Table 1). The Singapore NAFLD cohort also had a lower baseline ALT (52 U/L vs. 73 U/L, p = 0.004) and a lower AST (39 U/L vs. 48 U/L, p = 0.004), but a higher serum creatinine (82 μmol/L vs. 78 μmol/L, p = 0.019) than the US cohort.

Identification of low-risk NAFLD using FIB-4, LSM, and FAST score
The individual performance of FIB-4, LSM, and FAST score to identify low-risk NAFLD is summarized in Supplemental Table 8, http://links.lww.com/HC9/A486. In sequential testing, FIB-4 with a cutoff value of 1.3 first identified 59.4% of the patients as low-risk NAFLD, without missing any patients with LRE. In the second step, LSM with a cutoff value of 8 kPa identified a further 18.2% of the patients as low-risk NAFLD. In other words, sequential FIB-4 and LSM testing identified 77.6% of the cohort as low-risk NAFLD at the expense of missing 3/27 (11.1%) LREs. Combining FIB-4 and LSM for all patients reduced the proportion of missed LREs to 0%, but the proportion of patients identified as low-risk NAFLD also fell to 43.1%. FAST score identified a similar proportion of low-risk patients with NAFLD to the combination strategy (42.5%) at the expense of missing more LREs (6/27, 22.2%). These findings support a sequential approach of FIB-4 followed by LSM over the approach of using FIB-4 alone or performing LSM for everyone.
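The sequential strategy described above can be sketched as a two-step rule. The cutoffs of 1.3 (FIB-4) and 8 kPa (LSM) come from the text. The FIB-4 formula shown is the standard published one (age × AST divided by platelet count × √ALT), which this article does not restate, and the function names and the use of strict "less than" comparisons at each cutoff are assumptions for illustration.

```python
import math
from typing import Optional

FIB4_LOW = 1.3      # low-risk FIB-4 cutoff used in the text
LSM_LOW_KPA = 8.0   # LSM cutoff (kPa) used in the text


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_10e9_l: float) -> float:
    """Standard FIB-4 index: (age x AST) / (platelets x sqrt(ALT)).

    Units: age in years, AST and ALT in U/L, platelets in 10^9/L.
    """
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def sequential_triage(fib4_score: float, lsm_kpa: Optional[float] = None) -> str:
    """Step 1: FIB-4. Step 2: LSM only if FIB-4 is not low.

    Assumption: values exactly at a cutoff are treated as not low risk.
    """
    if fib4_score < FIB4_LOW:
        return "low risk (FIB-4)"      # LSM not needed in this strategy
    if lsm_kpa is None:
        return "LSM required"
    if lsm_kpa < LSM_LOW_KPA:
        return "low risk (LSM)"
    return "not low risk"


# Hypothetical patients with made-up laboratory values.
score_a = fib4(age_years=50, ast_u_l=40, alt_u_l=64, platelets_10e9_l=200)
score_b = fib4(age_years=60, ast_u_l=80, alt_u_l=64, platelets_10e9_l=150)

result_a = sequential_triage(score_a)                # LSM is never consulted
result_b = sequential_triage(score_b, lsm_kpa=6.5)
result_c = sequential_triage(score_b, lsm_kpa=14.0)
```

Because the first branch returns before LSM is examined, a patient with low FIB-4 stays in the low-risk group whatever the LSM value, which is the behavior the sequential approach is meant to capture.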

DISCUSSION
In this international study including 1262 patients with NAFLD followed up over a median of 3.5 years, we found that FIB-4 has excellent negative predictive value to predict LRE among patients with NAFLD, regardless of LSM. Further, no patients with low FIB-4 developed LREs or death, thus supporting the current guidelines of not performing VCTE among low-risk NAFLD patients even in the secondary or tertiary care setting. The performance of FIB-4 in predicting LRE and death was also similar to that in a European study involving 1173 patients with NAFLD. [19]
Most NAFLD guidelines recommend a sequential approach with FIB-4 followed by LSM in patients with intermediate or high FIB-4 because LSM has higher sensitivity and specificity for advanced fibrosis than FIB-4. [20] However, it is unclear whether FIB-4 is a superior prognostic score to LSM, which is arguably the more clinically relevant question. [21] Further, there are very limited data on how to interpret discordant results, such as high/intermediate FIB-4 with low LSM, or low FIB-4 with high LSM. Current guidelines recommend the use of liver biopsy in the setting of discordant results between FIB-4 and LSM. [5,22] In practice, repeat LSM may be considered if there is concern over liver biopsy or if LSM results are unreliable due to elevated liver enzymes or a high interquartile range. Here, we found that LSM does not outweigh FIB-4: patients with low FIB-4 have an extremely low risk of LREs regardless of LSM. Further, our findings highlight the disadvantages of VCTE in patients with low FIB-4: 89/910 (10%) of the patients with FIB-4 <1.3 had LSM > 12 kPa, yet none of these patients had LREs during follow-up, suggesting that 10% of the patients with low FIB-4 are incorrectly identified as high risk based on LSM by VCTE. Similarly, the combination FIB-4/LSM approach demonstrated poorer risk stratification than the sequential approach (Supplemental Table S6, http://links.lww.com/HC9/A486).
Even in patients undergoing LSM by VCTE following FIB-4, we believe that LSM should not be considered the "superior" test, but rather LSM and FIB-4 should be considered complementary. We found that within each LSM category, higher FIB-4 was associated with a dose-dependent increase in the incidence rate of LREs, and vice versa. Thus, combinations of NITs provide more prognostic information than individual NITs, and FIB-4 has value even in patients who have undergone more specialized fibrosis assessment. The impact of FAST on LREs in patients with NAFLD has not, to our knowledge, been previously studied. FAST was designed as a noninvasive approach to identify patients with high-risk NASH (ie, NASH plus significant fibrosis) who may benefit from pharmacologic treatment, whereas FIB-4 and LSM were originally developed as noninvasive metrics of fibrosis stage, without accounting for "disease activity." [11] We found that FAST score was associated with LREs, but this association was relatively weak, with a lower tAUC than FIB-4 or LSM. These findings can be interpreted in two ways. First, we showed that FAST is measuring a clinically relevant parameter in that patients with higher FAST scores were more likely to develop LREs than those with low FAST. Second, consistent with prior literature on histologically defined NAFLD, steatohepatitis (as defined by FAST) is less predictive of adverse events than fibrosis stage (as defined by FIB-4 or LSM). [3] Of note, our follow-up period was relatively short. Because the effects of FAST-defined NASH may accumulate over time, FAST may have a greater impact after prolonged follow-up. In addition, whether patients with high FAST are more likely to respond to treatment than those with lower FAST scores is not known. Further studies will be required to understand the potential applications of the FAST score.
Our findings were contrary to an American study including 81,108 patients with NAFLD diagnosed using ICD codes, which suggested that FIB-4 was an independent predictor of MACE. [23] The difference in results is likely related to the younger age of patients with NAFLD (52 vs. 62 years) and lower rates of MACE (0.7% vs. 13.5%) in our cohort. Our findings were similar to those of the NASH Clinical Research Network cohort study, which showed that FIB-4 score was not associated with a higher incidence of MACE. [3] Collectively, these findings suggest that more data are needed before FIB-4 can be used to stratify MACE risk among patients with NAFLD in a routine clinical setting.
Strengths of the study include the use of consecutive patients with NAFLD undergoing LSM in 2 countries and the use of hard clinical outcomes rather than surrogate measures of disease. We believe the diagnosis of NAFLD using radiological imaging is more accurate than the use of ICD codes alone in other studies. [24,25] All the clinical events were manually verified through chart review and validated with high accuracy. Limitations include that our cohorts were derived from secondary/tertiary care centers, though this limitation is intrinsic to nearly all real-world studies of LSM since VCTE is rarely done in a primary care setting. Due to the retrospective study design, we were unable to rule out excess alcohol intake not documented in the medical records or to fully assess baseline cardiac risk. To conclude, FIB-4 has excellent negative predictive value to identify patients with NAFLD at low risk of LRE who can be monitored in the primary care setting. Our findings support the sequential approach of FIB-4 followed by LSM by VCTE recommended by most international guidelines and highlight the disadvantages of routine VCTE in patients with low FIB-4. In contrast, in higher-risk groups, the combination of FIB-4 and LSM can risk-stratify patients with NAFLD for LRE beyond FIB-4 or LSM alone, with a high risk of LRE in patients with concordantly high FIB-4 and LSM.

TABLE 2 The 5-year cumulative incidence of liver-related events and death stratified based on FIB-4, FAST score, and liver-stiffness measurement
Note: Incidence rate is shown as events per 1000 person-years (95% CI) in the overall cohort. Person-years was rounded to the nearest. 5-year cumulative incidence is shown as events/number at risk (%, 95% CI).
a p-value reflects differences in cumulative incidence between different subgroups of FIB-4, LSM, and/or FAST using the Fisher exact test.
Note: tAUC was compared using the Delong test.
Abbreviations: FAST, Fibroscan-AST; FIB-4, Fibrosis index of 4 factors; LSM, liver-stiffness measurement; NPV, negative predictive value; PPV, positive predictive value; tAUC, time-dependent area under the receiver operating characteristic curve.